Quantifying partisan news diets in Web and TV audiences

Partisan segregation within the news audience buffers many Americans from countervailing political views, posing a risk to democracy. Empirical studies of the online media ecosystem suggest that only a small minority of Americans, driven by a mix of demand and algorithms, are siloed according to their political ideology. However, such research omits the comparatively larger television audience and often ignores temporal dynamics underlying news consumption. By analyzing billions of browsing and viewing events between 2016 and 2019, with a novel framework for measuring partisan audiences, we first estimate that 17% of Americans are partisan-segregated through television versus roughly 4% online. Second, television news consumers are several times more likely to maintain their partisan news diets month-over-month. Third, TV viewers’ news diets are far more concentrated on preferred sources. Last, partisan news channels’ audiences are growing even as the TV news audience is shrinking. Our results suggest that television is the top driver of partisan audience segregation among Americans.

), this does not imply homogeneity of bias in Spanish language news or speak to the average partisan bias of Spanish-language news as a whole. Lastly, we relabel news programming from 'Fox Business' as being from the Fox News Channel due to programming similarity. The remaining cable channels are CNN, Fox News, and MSNBC, which are also used in Figures 1 and 2. This source-categorization schema is visualized in Table S3.

Web News Operationalization
To classify web-browsing behavior as 'news consumption' or not, we first apply Nielsen's categorization schema at the domain level. News 'domain' refers to the top-level URL feature of a news website. For example, the domain for any New York Times article hosted by the New York Times' own website is nytimes.com. More than three thousand unique web domains (out of all web domains accessed by participants) were categorized by Nielsen as being news providers. These websites include well-known publishers (e.g., nytimes.com, huffpo.com), cable-news online (e.g., Foxnews.com, cnn.com), broadcast affiliates online (e.g., abc7.com, nbcLosAngeles.com), small local publications (e.g., lowellsun.com, omaha.com), specific political outlets (e.g., redstate.com, commondreams.org), government institutions (e.g., state.gov, senate.gov), news aggregators (e.g., news.aol.com, news.google.com), and non-political news coverage such as sports, gossip, finance, weather, and tech news. We manually removed news sites that are explicitly or exclusively non-political in their primary content. We note also that our strategy automatically captures news websites that were arrived at via social media, but not political content consumed directly on social media or video streaming websites. We know from prior research that approximately 6.4% of news consumption occurs on smartphones, which we do not analyze here (28). (This percentage is computed using
To assign partisan bias labels to news domains, we draw on audience behavior. Specifically, we make use of (18), which provides a list of domains scored between 0 (extreme left) and 1 (extreme right) according to large-scale sharing behavior on Twitter. The scores are based on proportions of left- and right-leaning Twitter users who shared links from each domain during a sampling period in 2018.
The intersection of this set and the news domains encountered in the Nielsen panel is 1,718 domains, which together account for more than 95% of online news consumption time in our panel. For our analysis, we reduce the scoring system provided by (18) to an ordinal ranking system. The source categorization schema is shown in Table S4. Figure S1 is a histogram showing the resulting distribution of unique websites across a spectrum of bias, as well as locations of bias thresholds used in the main paper. Note that this histogram shows the count of unique domains at each bias score level, not the frequency of visits or visitors. We reduce the continuous scoring system visualized here to an ordinal ranking system. To corroborate this ranking further, we also compare our domain ranking against the one established by Bakshy et al. (2015) (8). These scores were based on the URL sharing patterns of Facebook users who identified with a political party, seven years prior to the current analysis. This Facebook-derived set includes 500 domains. We again find the intersection between this set and our preferred set, preserve the rankings from each source, and calculate the Pearson correlation coefficient between those rankings. This comparison yielded a correlation coefficient of 0.962, a remarkably high correlation. Hence, our chosen domain-ranking system is not only corroborated by a separate analysis on the same data source (versus Eady et al., (2018) (11)), but also by an analysis using an alternative data source, Facebook.
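The ranking comparison described above can be illustrated with a minimal sketch. It assumes each source is available as a mapping from domain to bias score, and it recomputes ordinal ranks within the intersection, which preserves each source's relative ordering; whether the original analysis used within-intersection ranks or the full-list ranks is not stated here. The function names, the tie-breaking rule, and the toy domains are all hypothetical.

```python
from math import sqrt

def ordinal_ranks(scores):
    """Map {domain: score} to {domain: rank}; rank 1 is the lowest (furthest-left) score.
    Ties are broken alphabetically, which is an arbitrary choice made for this sketch."""
    ordered = sorted(scores, key=lambda d: (scores[d], d))
    return {d: i + 1 for i, d in enumerate(ordered)}

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def rank_agreement(scores_a, scores_b):
    """Correlate the ordinal rankings of the domains that both sources score."""
    shared = sorted(set(scores_a) & set(scores_b))
    ranks_a = ordinal_ranks({d: scores_a[d] for d in shared})
    ranks_b = ordinal_ranks({d: scores_b[d] for d in shared})
    return pearson([ranks_a[d] for d in shared], [ranks_b[d] for d in shared])

# Toy, made-up scores on a 0 (left) to 1 (right) scale.
twitter_like = {"a.example": 0.1, "b.example": 0.4, "c.example": 0.9, "d.example": 0.5}
facebook_like = {"a.example": 0.2, "b.example": 0.3, "c.example": 0.8, "e.example": 0.5}
agreement = rank_agreement(twitter_like, facebook_like)
```

Because only the three shared toy domains enter the comparison and both sources order them identically, the two rankings agree perfectly in this example.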

Delineating bias in news content
The prior section explained the creation of ordinal rankings of online news partisanship. This section now describes how we identify whole categories of online news using this ranking, along with our analogous approach to sorting television news programming.
Theoretically, partisan bias is a continuous variable, but one that is essentially impossible to quantify precisely in a universally satisfying way; partisan-ideological sorting is imperfect, positions change, and interpretations are not concrete. However, as the focus of this paper is the extent to which individuals are surrounded by clearly partisan content, we leave genuine partisan extremity as a latent variable, and classify news content in discrete terms: news content is either partisan or not, and either left or right. To do this, we first identify familiar online news content that is broadly reputed to present partisan bias, and set that content as the threshold between 'partisan' and 'not partisan.' Among news sites, we do this based on the ordinal ranking of domains established by Robertson et al. (18). For television news, we do not try to map the various news programs to a continuum, or to systematically characterize subtle ideological differences between them. We instead utilize the conventional understanding that Fox News and Fox Business are furthest right among large-scale television news networks, MSNBC is furthest left, and the rest of the channels, including all of the major broadcast networks, adhere more closely to a centrist approach. The one major television news source we concede as having arguable status is the remaining large-scale cable news network, CNN. We demonstrate the effect of including CNN as a left or centrist news network in Figure 1 Panel A and B respectively. Endogenous clustering of television news program viewership recreated the major conventional groups: major cable channels, hard broadcast, soft broadcast.
In contrast to television news, news websites are not aggregated by channels, and so bias categorization is done at the level of the individual news website. On the left, we identify slate.com as the stringent boundary for the left, and theguardian.com as the lenient boundary for the left. On the right, we identify breitbart.com as the stringent boundary for the right, and Foxnews.com as the lenient boundary for the right. Hence, any news websites with a bias ranking beyond these boundaries are considered to be partisan. This should not be interpreted to mean that all rightward-biased television news programming is only as biased as the most leniently right-biased news domain we include, Foxnews.com. Given that our bias rankings for web domains were based on audience measures, the long thin tail of right-leaning web domains nudges Foxnews.com empirically leftward. Conceptually, this same rank ordering would apply within Fox News, which is not crowded-out with small competitors, such that the most left-leaning Fox News program serves as our boundary for rightwardly-biased news, and a long tail of Fox News programs exists to the right of that boundary.
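As an illustration of the boundary-based labeling above, the following sketch classifies domains from an ordinal ranking and a pair of boundary domains. It assumes the boundary domains themselves count as partisan (the text describes Foxnews.com as included on the right; the same treatment of the left boundary is an assumption). The placeholder `.example` domains and their positions are invented for the example; only the relative order of the four named boundary sites follows from the lenient and stringent definitions.

```python
def classify(ranking, left_boundary, right_boundary):
    """Label each domain in `ranking` (ordered furthest left to furthest right).

    Domains at or beyond a boundary are labeled partisan; treating the
    boundary domain itself as partisan is an assumption of this sketch.
    """
    position = {domain: i for i, domain in enumerate(ranking)}
    left_cut = position[left_boundary]
    right_cut = position[right_boundary]
    labels = {}
    for domain, i in position.items():
        if i <= left_cut:
            labels[domain] = "left"
        elif i >= right_cut:
            labels[domain] = "right"
        else:
            labels[domain] = "not partisan"
    return labels

# Hypothetical ranking: the .example entries are placeholders.
RANKING = [
    "far-left.example", "slate.com", "theguardian.com", "center.example",
    "foxnews.com", "breitbart.com", "far-right.example",
]

lenient = classify(RANKING, "theguardian.com", "foxnews.com")
stringent = classify(RANKING, "slate.com", "breitbart.com")
```

Under the lenient boundaries, theguardian.com and foxnews.com are labeled partisan; under the stringent boundaries, both fall into the non-partisan middle.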

Setting and varying the minimum threshold of time spent consuming news
In the main text, identification as a news consumer is based on the amount of time in a month that an individual spends consuming news. In the lenient operationalization common in the literature, anybody who watches any amount of television news in a given month is considered a news consumer, and likewise for desktop news consumption. This approach raises three practical concerns. First, an individual consuming a very small duration of news has less opportunity to diversify their news diet, and is thus statistically biased toward being counted as partisan segregated. Second, an individual consuming a very small duration of news is more likely to have accidentally encountered the news content (i.e., by channel surfing or following a link just once) and thus is not actually an active news consumer. Third, individuals who only consume a very short duration of news are not the core population of politically minded individuals with dangerously skewed news diets. As such, we set our stringent bound of television news consumption at thirty minutes per month, which represents a single thirty-minute news program in a month, or roughly one minute of television news per day. For online news consumption, we set this threshold to two minutes per month. This monthly threshold is proportionate to our thirty-minute monthly threshold for television consumption, based on the average amount of time that Americans spend consuming news from either platform. In Figure S2, Figure 1 from the main text is recreated without setting minimum thresholds of news consumption. That is, anybody who consumes just a moment of news online or on TV is counted as a news consumer, and hence may be counted as experiencing partisan segregation.
Panelists may have multiple membership sessions throughout their participation in the panel, separated by one or more months. By design, the Kaplan-Meier estimator accounts for right-censoring, which is crucial for analyzing behavior within the rotating panels.
Left-censoring, which is theoretically symmetrical to right-censoring in our panel, was not directly addressed in this paper, but may be an avenue for future methodological research. The authors considered removing panelists whose first panel month was also the start of a partisan segregation session, as we cannot determine their news diets in preceding months. However, doing so would bias our results by targeting individuals with longer or more frequent partisan segregation sessions, as these individuals would be more likely to face this left-censoring by definition. Hence, in these cases, the panelist's first month of panel membership is coded as m=1 for partisan segregation by assumption. Bootstrap-based confidence intervals calculated for the point estimates in Figure 2 were small enough to be indiscernible. Table S6 provides the data underlying Figure 2, as well as the same values expressed as the percentage of Americans.
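For readers unfamiliar with the estimator, a minimal sketch of a Kaplan-Meier survival curve with right-censoring follows. It assumes each partisan segregation session has been reduced to a pair (duration in months, whether the end of the session was observed); a session that is still ongoing when the panelist leaves the panel is treated as right-censored. This encoding, the function name, and the toy sessions are hypothetical and are not taken from the paper's pipeline.

```python
from collections import Counter

def kaplan_meier(sessions):
    """Return {month: estimated probability that a session lasts beyond that month}.

    sessions: list of (duration_months, ended) pairs, where ended=False marks a
    right-censored session (the panelist left the panel while still segregated).
    """
    ended_at = Counter(d for d, ended in sessions if ended)   # observed session ends
    leaving_at = Counter(d for d, _ in sessions)               # ends plus censorings
    at_risk = len(sessions)
    survival = 1.0
    curve = {}
    for month in sorted(leaving_at):
        if ended_at[month]:
            # Standard product-limit update: multiply by (1 - ends / at risk).
            survival *= 1 - ended_at[month] / at_risk
        curve[month] = survival
        at_risk -= leaving_at[month]  # censored sessions leave the risk set too
    return curve

# Toy sessions: two end after 1 month, one is censored at 2 months,
# one ends after 3 months, and one is censored at 3 months.
toy_sessions = [(1, True), (1, True), (2, False), (3, True), (3, False)]
curve = kaplan_meier(toy_sessions)
```

Censored sessions contribute to the risk set up to the month they are last observed, but they never count as an observed end, which is how the estimator avoids understating persistence in a rotating panel.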

Table S6. Survival analysis of news audiences with left-biased or right-biased news diets via TV and online.
The four columns on the left show likelihood estimates underlying Figure 2 in the main text. The four columns on the right reframe these likelihood scores in terms of percent of Americans. For example, in expectation, 14% of all left-biased TV news consumers maintain a left-biased news diet for 12 consecutive months. Based on the number of Americans with left-biased TV news diets, this is roughly equivalent to 1.22% of Americans. As with Figure 2 in the main text, Table S6 follows a lenient approach to parameter 1 (news diet composition) and parameter 2 (news partisanship).

Partisan Segregation Churn & Time Scales
The individual-level analysis in Figure 2 is distinct from, but closely related to, the aggregate-level phenomenon of individuals rotating into and out of experiencing partisan segregation across time periods. To measure the aggregate level of churn of partisan segregation, along with the robustness of our findings to aggregation units larger than one month, we calculate the average remain rates of partisan segregation from one time period to the next using 4 aggregation units: 1 month (as used throughout the main text), 2 months, 6 months, and 12 months. Figure S3 shows the results of the procedure described below.
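One plausible reading of an "average remain rate" can be sketched as follows; it is an illustration only and is not claimed to be the procedure behind Figure S3. The sketch assumes that, for a chosen aggregation unit, the set of partisan-segregated panelists has already been determined for each consecutive period, and that every panelist is observed in both periods of each pair (panel rotation is ignored). The function name and toy IDs are hypothetical.

```python
def average_remain_rate(segregated_by_period):
    """Average share of people segregated in one period who remain so in the next.

    segregated_by_period: list of sets of panelist IDs, one set per consecutive
    period (a period may be 1, 2, 6, or 12 months long).
    """
    rates = []
    for current, following in zip(segregated_by_period, segregated_by_period[1:]):
        if current:  # skip periods in which nobody is segregated
            rates.append(len(current & following) / len(current))
    return sum(rates) / len(rates) if rates else float("nan")

# Toy data: three consecutive periods with made-up panelist IDs.
toy_periods = [{1, 2, 3, 4}, {1, 2, 5}, {1, 5}]
toy_rate = average_remain_rate(toy_periods)
```

In the toy data, 2 of 4 people remain after the first period and 2 of 3 after the second, so the average remain rate is the mean of 0.5 and 2/3.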

Identifying Archetypes of News consumption
To identify archetypes of news consumption via either web or TV, the first step is to identify categories of news on either platform, rather than clustering the vast number of television programs and websites. Television news programming was sorted into seven categories following from their natural grouping into channels, as laid out in Table S3. For desktop news consumption, we create these categories as follows. First, we treat portal websites as a singular category (e.g., yahoo.com, news.google.com) due to their structural similarity and similar mainstream appeal. We then divide the remaining websites into five categories (furthest-left, left, mainstream, right, furthest-right) based on the websites previously used to bound our lenient and stringent definitions of partisan bias.
Independently for either panel, these content categories are treated as independent dimensions for each panelist, measured by minutes spent consuming each category of content in an average month. Thus, each television panelist is assigned a seven-dimensional consumption vector, and a six-dimensional consumption vector is assigned to each desktop panelist. In either panel, we find the cosine similarity between every pair of panelists. This process creates two complete graphs, one for each panel, in which the nodes represent panelists and the edges that connect them are weighted according to the pairwise similarity of news diets. The two complete graphs are then pruned by removing any edges weighted below 0.97, while the remaining edges become unweighted. Outcomes are robust to moderate variations in threshold selection. Finally, in either graph, we run a Louvain community detection algorithm to identify communities of similar news consumers (including the large group of non-news consumers, omitted from
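The graph construction described in this section can be sketched as follows. The similarity and pruning steps use only the standard library; the community detection step assumes the third-party networkx package (version 2.8 or later) and its `louvain_communities` function, which may differ from the Louvain implementation used in the original analysis. Toy vectors have three dimensions rather than seven or six, panelists with all-zero vectors are simply left without edges (a simplification), and all names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two consumption vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pruned_edges(diets, threshold=0.97):
    """Unweighted edges between panelists whose diet similarity is at least `threshold`.

    diets: {panelist_id: [average monthly minutes per content category]}
    """
    return [
        (p, q)
        for p, q in combinations(sorted(diets), 2)
        if cosine(diets[p], diets[q]) >= threshold
    ]

def louvain_archetypes(diets, threshold=0.97, seed=0):
    """Run Louvain community detection on the pruned, unweighted graph."""
    import networkx as nx  # imported lazily so the steps above run without it

    graph = nx.Graph()
    graph.add_nodes_from(diets)
    graph.add_edges_from(pruned_edges(diets, threshold))
    return nx.community.louvain_communities(graph, seed=seed)

# Toy diets: minutes per month in three made-up content categories.
toy_diets = {"a": [60, 0, 0], "b": [30, 1, 0], "c": [0, 0, 45], "d": [0, 2, 50]}
toy_edges = pruned_edges(toy_diets)
```

Cosine similarity compares the composition of a diet rather than its volume, so panelists "a" and "b" are linked even though one consumes twice as many minutes as the other.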

Growth and shrinkage of archetype popularity over time
In Figure S4, we illustrate the dynamics of all TV news consumption archetypes between 2016 and 2019. As the popularity of broadcast news has decreased over this period, the size of cable news audiences (for CNN, Fox News, MSNBC) has risen. Figure S4 also demonstrates the large increase in the share of Americans who do not watch news on television; this has been on the rise since a local minimum during the 2016 election. In purple, we show the number of Americans who are not television news consumers but have access to television, based on the Nielsen company's estimation of the market, and empirically, individuals' inclusion in the Nielsen panel data. Figure S5 illustrates change in the web archetypes, using the same format. All panelists in the data are assigned to a single archetype, or the "No or Minimal News" group. In purple, we show the difference between the entire adult U.S. population and the number of news consumers identified in our data. Figure 4 in the main text illustrates the flow between archetypes of television news consumption over the four-year duration of our panel. In Figure S6 below, we create the same plot for web consumption archetypes. As with television news consumption, we see that the strongest attractor is a move away from consuming news. Other net dynamics are relatively small, suggesting that no archetype has gained or lost large numbers of Americans. This low net flow (which should not be confused with stability or a lack of exchange between archetypes) is in line with the findings of That is, the temporal instability of online partisan segregation and web news archetypes is noisy enough to drown out possible macro-trends in Americans' news diets.
Figure S6: The "net flow" of people between pairs of nine television news archetypes.
These are the eight television news archetypes seen in Figure 3 Panel 2, each labeled according to the category of news most prominently consumed, and a 9th group of people who are exposed to less than 2 minutes of news in a month, over the 4 years of analysis. Net flow represents the direction and magnitude of turnover between a pair of archetypes. Specifically, if we let $A_i^k$ be the set of people in archetype $i$ during month $k$, the "net flow" between archetypes $i$ and $j$ is defined as the absolute value of the expression $|A_i^k \cap A_j^{k+1}| - |A_j^k \cap A_i^{k+1}|$ summed over all pairs of months $(k, k+1)$. The direction of the net flow, signified by the arrows, points toward group $j$ if the net flow, before taking the absolute value, is positive, and toward group $i$ if it is negative. We do not show net flows of less than 1M people. Node diameter corresponds to the size of the population of the archetypal cluster averaged over all months. Green signifies that an archetype has experienced net inflow, while blue signifies net outflow, with alpha levels corresponding to the scale of net inflow or outflow.
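The net flow definition in this caption can be illustrated with a short sketch. It reads the expression as the number of people who move from archetype i to archetype j between consecutive months, minus the number who move from j to i, summed over all month pairs; this reading is an interpretation chosen to match the stated arrow direction. The sketch counts panelists directly rather than population-weighted Americans, and the data layout and names are hypothetical.

```python
def net_flow(membership, i, j):
    """Signed net flow between archetypes i and j over consecutive months.

    membership: list over months of {archetype_label: set of person IDs}.
    A positive result means the arrow points toward j; negative, toward i.
    The magnitude reported in the figure is the absolute value of this sum.
    """
    total = 0
    for current, following in zip(membership, membership[1:]):
        moved_i_to_j = len(current.get(i, set()) & following.get(j, set()))
        moved_j_to_i = len(current.get(j, set()) & following.get(i, set()))
        total += moved_i_to_j - moved_j_to_i
    return total

# Toy membership for three months and two made-up archetypes.
toy_months = [
    {"A": {1, 2, 3}, "B": {4}},
    {"A": {1, 4}, "B": {2, 3}},
    {"A": {1, 2}, "B": {3, 4}},
]
toy_flow = net_flow(toy_months, "A", "B")
```

In the toy data, two people move from A to B and one moves back in the first month pair, while the second month pair exchanges one person in each direction, so the signed net flow from A to B is 1 even though four moves occurred. This is why low net flow should not be read as a lack of exchange.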